Value set: lift offset from numeric constants to expressions #8647

tautschnig · 2025-05-30T13:30:20Z

We can safely track arbitrary expressions as pointer offsets rather than limit ourselves to just constant offsets (and then treating all other expressions as "unknown").
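To illustrate the change described above, here is a minimal sketch contrasting the two offset models. It is a toy under stated assumptions, not CBMC code: `Expr`, `constant`, `symbol`, `binary`, `render`, `old_offset` and `new_offset` are hypothetical names invented for this example and do not correspond to `exprt` or `value_sett`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy expression tree for illustration only; NOT CBMC's exprt.
struct Expr
{
  std::string op;                       // "const", "sym", "+" or "*"
  std::string name;                     // symbol name when op == "sym"
  std::int64_t value;                   // constant value when op == "const"
  std::shared_ptr<const Expr> lhs, rhs; // operands of "+" and "*"
};
using ExprPtr = std::shared_ptr<const Expr>;

ExprPtr constant(std::int64_t v)
{
  return std::make_shared<Expr>(Expr{"const", "", v, nullptr, nullptr});
}

ExprPtr symbol(std::string n)
{
  return std::make_shared<Expr>(Expr{"sym", std::move(n), 0, nullptr, nullptr});
}

ExprPtr binary(std::string op, ExprPtr a, ExprPtr b)
{
  return std::make_shared<Expr>(
    Expr{std::move(op), "", 0, std::move(a), std::move(b)});
}

std::string render(const ExprPtr &e)
{
  if(e->op == "const")
    return std::to_string(e->value);
  if(e->op == "sym")
    return e->name;
  return "(" + render(e->lhs) + " " + e->op + " " + render(e->rhs) + ")";
}

// Old model (assumed): only numeric constants are tracked;
// std::nullopt plays the role of "unknown".
std::optional<std::int64_t> old_offset(
  std::optional<std::int64_t> base, const ExprPtr &index, std::int64_t elem_size)
{
  if(!base.has_value() || index->op != "const")
    return std::nullopt;
  return *base + index->value * elem_size;
}

// New model (assumed): the offset is itself an expression, so a
// non-constant index is kept symbolically instead of being dropped.
ExprPtr new_offset(ExprPtr base, ExprPtr index, std::int64_t elem_size)
{
  return binary(
    "+", std::move(base), binary("*", std::move(index), constant(elem_size)));
}

void demo()
{
  // A constant index works in both models.
  assert(old_offset(0, constant(2), 4).value() == 8);
  // A symbolic index degrades to "unknown" in the old model ...
  assert(!old_offset(0, symbol("i"), 4).has_value());
  // ... but is preserved as an expression in the new one.
  assert(render(new_offset(constant(0), symbol("i"), 4)) == "(0 + (i * 4))");
}
```

Calling `demo()` exercises both models. The sketch keeps the expression unsimplified; whether and where the real implementation simplifies offsets is not shown in this conversation.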

Each commit message has a non-empty body, explaining why the change was made.
n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
n/a My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

codecov · 2025-06-03T09:11:52Z

Codecov Report

Attention: Patch coverage is 73.68421% with 15 lines in your changes missing coverage. Please review.

Project coverage is 80.36%. Comparing base (eef9677) to head (96844b5).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pointer-analysis/value_set.cpp | 72.72% | 15 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8647      +/-   ##
===========================================
- Coverage    80.36%   80.36%   -0.01%     
===========================================
  Files         1688     1688              
  Lines       207067   207073       +6     
  Branches        73       73              
===========================================
- Hits        166418   166414       -4     
- Misses       40649    40659      +10

src/goto-symex/goto_symex_state.cpp

remi-delmas-3000 · 2025-06-18T19:47:08Z

src/goto-symex/shadow_memory_util.cpp

@@ -981,7 +981,7 @@ normalize(const object_descriptor_exprt &expr, const namespacet &ns)
  {
    return expr;
  }
-  if(expr.offset().id() == ID_unknown)
+  if(!expr.offset().is_constant())


Can we get a high-level description of the normal form we're trying to reach?

The normal form is root object + constant offset, since the object can be an arbitrary access path into the root object. Pointer equality checks then become trivial. Maybe simplify_expr has become good enough in the meantime.
Anyway, this seems orthogonal to this PR.
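As a rough illustration of why a "root object + constant offset" normal form makes pointer equality trivial, consider the following sketch. All types and functions here (`Step`, `AccessPath`, `Normalized`, `normalize_path`, `same_address`) are hypothetical and only model the idea; they are not the `normalize` function from shadow_memory_util.cpp, whose body is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One step of an access path, already resolved to a constant byte offset
// (for example a struct member or a constant array index). Hypothetical type.
struct Step
{
  std::int64_t byte_offset;
};

// An access path such as s.b[1]: a root object plus a sequence of steps.
struct AccessPath
{
  std::string root;
  std::vector<Step> steps;
};

// Assumed normal form: root object plus a single constant byte offset.
struct Normalized
{
  std::string root;
  std::int64_t offset;
};

Normalized normalize_path(const AccessPath &p)
{
  Normalized n{p.root, 0};
  for(const Step &s : p.steps)
    n.offset += s.byte_offset; // fold every step into one offset
  return n;
}

// In normal form, two pointers are equal exactly when root and offset agree.
bool same_address(const Normalized &a, const Normalized &b)
{
  return a.root == b.root && a.offset == b.offset;
}

void demo()
{
  // struct S { int a; int b[4]; } s; with 4-byte int (an assumption):
  // &s.b[1] is member b at offset 4, then index 1 at offset 4.
  AccessPath via_member{"s", {Step{4}, Step{4}}};
  // The same address expressed as byte offset 8 from the root.
  AccessPath via_bytes{"s", {Step{8}}};
  AccessPath other{"s", {Step{12}}};

  assert(same_address(normalize_path(via_member), normalize_path(via_bytes)));
  assert(!same_address(normalize_path(via_member), normalize_path(other)));
}
```

The sketch only handles constant steps, which matches the hunk above in spirit: a non-constant offset cannot be folded into this form.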

remi-delmas-3000 · 2025-06-18T19:52:21Z

src/pointer-analysis/value_set.cpp

@@ -184,7 +183,7 @@ void value_sett::output(std::ostream &out, const std::string &indent) const
        stream << "<" << format(o) << ", ";

        if(o_it->second)
-          stream << *o_it->second;
+          stream << format(*o_it->second);


Now we have to print an expression instead of a mere integer.

Yes, but why is that a concern?

remi-delmas-3000

I have one remaining question: now that we have symbolic offsets for pointer expressions instead of just "constants" or "unknown", is there any way to use that to compute more precise results in get_value_set_rec? I know it lets us be more precise when modelling assignments, but I don't understand why we don't get a similar gain in precision when computing dereferences or traversing value sets.

Other question: now that the value set representation "knows" that an expression array[i] has an offset of the form i * sizeof(T), could we take extra constraints about i into account during the value set traversal? Say we're trying to resolve array[i] in the context of a basic loop invariant 0 <= i && i <= len(array). Knowing range constraints about i, we could maybe avoid injecting values representing OOB accesses into the value set for array[i].
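The idea in the second question can be sketched as a simple interval check. This is only an illustration of the proposal, not something the value set code is known to do; `Interval`, `offset_range` and `always_in_bounds` are hypothetical names, and the 4-byte element size is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Inclusive integer interval [lo, hi]. Hypothetical helper type.
struct Interval
{
  std::int64_t lo;
  std::int64_t hi;
};

// Byte-offset range of array[i] for i in idx, assuming elem_size > 0
// so that multiplication preserves the ordering of the bounds.
Interval offset_range(Interval idx, std::int64_t elem_size)
{
  return Interval{idx.lo * elem_size, idx.hi * elem_size};
}

// True if every access of elem_size bytes starting at an offset in r
// lies entirely within an object of object_size bytes.
bool always_in_bounds(
  Interval r, std::int64_t elem_size, std::int64_t object_size)
{
  return r.lo >= 0 && r.hi + elem_size <= object_size;
}

void demo()
{
  const std::int64_t elem_size = 4;    // assumed sizeof(T)
  const std::int64_t object_size = 40; // T array[10]

  // 0 <= i && i < 10: every dereference stays inside the array.
  assert(always_in_bounds(
    offset_range(Interval{0, 9}, elem_size), elem_size, object_size));

  // 0 <= i && i <= 10: i == 10 is one past the end, so a dereference
  // at that index is out of bounds and cannot be ruled out.
  assert(!always_in_bounds(
    offset_range(Interval{0, 10}, elem_size), elem_size, object_size));
}
```

Note that under an invariant of the form `i <= len(array)` the index `len(array)` is still admitted, so range information of that shape would not by itself exclude an out-of-bounds dereference.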

peterschrammel

It seems we are lacking tests in 4 places here. Given that this is anything but trivial, it would be great to find some test cases that trigger these.

peterschrammel · 2025-06-19T20:55:10Z

src/goto-symex/shadow_memory_util.cpp

@@ -981,7 +981,7 @@ normalize(const object_descriptor_exprt &expr, const namespacet &ns)
  {
    return expr;
  }
-  if(expr.offset().id() == ID_unknown)
+  if(!expr.offset().is_constant())


The normal form is root object + constant offset, since the object can be an arbitrary access path into the root object. Pointer equality checks then become trivial. Maybe simplify_expr has become good enough in the meantime.
Anyway, this seems orthogonal to this PR.

peterschrammel · 2025-06-19T21:00:56Z

src/pointer-analysis/value_set.cpp

@@ -362,7 +361,8 @@ bool value_sett::eval_pointer_offset(
        if(!ptr_offset.has_value())
          return false;

-        *ptr_offset += *it->second;
+        *ptr_offset +=


Codecov reports that lines 361-368 are not covered by any tests.

peterschrammel · 2025-06-19T21:03:18Z

src/pointer-analysis/value_set.cpp

-            if(!i.has_value())
-              i = mp_integer{0};
-            i = *i + *offset;
+            additional_offset = plus_exprt{


Codecov reports that lines 731-733 are not covered by any tests.

peterschrammel · 2025-06-19T21:03:55Z

src/pointer-analysis/value_set.cpp

        }
        else
        {
-          *i *= *size;
+          additional_offset = mult_exprt{
+            *additional_offset, from_integer(*size, additional_offset->type())};

          if(expr.id()==ID_minus)


Codecov reports that lines 758-766 are not covered by any tests.

peterschrammel · 2025-06-19T21:05:17Z

src/pointer-analysis/value_set.cpp

        {
          auto size = pointer_offset_size(array_type.element_type(), ns);

          if(!size.has_value() || *size == 0)
            o.reset();
          else
-            *o = *i * (*size);
+          {


Codecov reports that lines 1416-1431 are not covered by any tests.

tautschnig · 2025-06-20T20:08:21Z

> I have one remaining question: now that we have symbolic offsets for pointer expressions instead of just "constants" or "unknown", is there any way to use that to compute more precise results in get_value_set_rec? I know it lets us be more precise when modelling assignments, but I don't understand why we don't get a similar gain in precision when computing dereferences or traversing value sets.

I have already seen this happen, although it isn't necessarily obvious unless one starts examining the formula that symex produces. #8653 is a consequence of my observations: I was surprised to still find "unknown" when I had expected a known offset.

> Other question: now that the value set representation "knows" that an expression array[i] has an offset of the form i * sizeof(T), could we take extra constraints about i into account during the value set traversal? Say we're trying to resolve array[i] in the context of a basic loop invariant 0 <= i && i <= len(array). Knowing range constraints about i, we could maybe avoid injecting values representing OOB accesses into the value set for array[i].

I'm not sure we even create those OOB values here.

tautschnig assigned peterschrammel May 30, 2025

tautschnig requested review from martin-cs and peterschrammel as code owners May 30, 2025 13:30

tautschnig self-assigned this May 30, 2025

tautschnig mentioned this pull request Jun 2, 2025

Huge SMT file and slow proof for simple array function #8617

Open

tautschnig force-pushed the value-set-offset branch from 9764fbe to 5d136c4 Compare June 3, 2025 08:42

tautschnig requested a review from kroening as a code owner June 3, 2025 08:42

tautschnig force-pushed the value-set-offset branch from 5d136c4 to 41811ba Compare June 3, 2025 09:59

tautschnig assigned kroening and unassigned tautschnig Jun 3, 2025

remi-delmas-3000 reviewed Jun 18, 2025

src/goto-symex/goto_symex_state.cpp (review thread resolved)

remi-delmas-3000 reviewed Jun 18, 2025

remi-delmas-3000 approved these changes Jun 18, 2025

peterschrammel approved these changes Jun 19, 2025

tautschnig assigned tautschnig and unassigned kroening and peterschrammel Jun 20, 2025

Value set: lift offset from numeric constants to expressions

96844b5

We can safely track arbitrary expressions as pointer offsets rather than limit ourselves to just constant offsets (and then treating all other expressions as "unknown").

tautschnig force-pushed the value-set-offset branch from 41811ba to 96844b5 Compare June 20, 2025 22:13
